Skip to content

feat: build & test relayflows core against @agent-relay 8.x#2

Merged
willwashburn merged 2 commits into
mainfrom
feat/sdk-v8-migration
Jun 5, 2026
Merged

feat: build & test relayflows core against @agent-relay 8.x#2
willwashburn merged 2 commits into
mainfrom
feat/sdk-v8-migration

Conversation

@willwashburn

@willwashburn willwashburn commented Jun 4, 2026

Copy link
Copy Markdown
Member

User description

Summary

Brings packages/core to a compiling, building state for the first time in this repo (it was imported from the relay monorepo mid-migration and never built here), migrates it off the removed v7 @agent-relay/sdk broker API onto @agent-relay/harness-driver 8.2, and rebrands the primitives to the @relayflows scope.

Depends on @agent-relay/harness-driver@8.2.0 (the SpawnedAgentHandle lifecycle primitive, added in AgentWorkforce/relay#1050).

Primitives

  • Rename @agent-relay/{github,slack,browser}-primitive@relayflows/*, aligned to the 0.1.0 lineage. Updates core deps, imports, README, and the publish workflow's release notes + build order.

Core SDK migration (v7 → v8)

  • Broker surface swapped from the removed v7 AgentRelay to HarnessDriverClient: construction via HarnessDriverClient.spawn({ onStderr }); the 7 legacy named listeners rewritten as a single onEvent(BrokerEvent) dispatcher (the named event bus isn't fed in direct-client usage); spawnPty, listAgents/release, shutdown; idle-nudge via sendMessage.
  • Consumes the new SpawnedAgentHandle (waitForExit/waitForIdle/exitCode) through a thin local WorkflowAgentHandle wrapper that preserves the runner's existing string contract (so waitForExitWithIdleNudging is unchanged).
  • Vendored the deleted CLI utilities (getCliDefinition/resolveCliSync/resolveSpawnPolicy) into core/src/cli-registry.ts; stripAnsistrip-ansi.
  • Bumped @agent-relay/{cloud,config,sdk,harness-driver,harnesses} to ^8.2.0; unified the repo on one SDK major (dropped browser-primitive's vestigial 7.1.1 pin).

core + cli typecheck clean and build.

Tests — ⚠️ in progress (why this is a draft)

  • ✅ Repaired monorepo-era import paths (../workflows/*, ../../provisioner/*) and the builtin-templates path (yaml-validation: 0 → 143 passing).
  • ✅ Established + applied the HarnessDriverClient mock pattern: workflow-runner.test.ts 0 → 59/65, via a BrokerEvent-translating emitMockEvent shim + fail-fast reconciliation (the runner now defaults to strategy:'retry', so first-pass failure tests opt into fail-fast).
  • 🔧 Remaining: migrate the same mock shape across ~11 more broker-mock test files (they currently spawn real brokers), and reconcile the supervised/owner/review behavioral assertions to the current runner. These were broken on extraction from the monorepo, independent of the SDK change.

See docs/sdk-v8-migration-plan.md for the full plan and findings.

🤖 Generated with Claude Code


Summary by cubic

Migrated packages/core from v7 @agent-relay/sdk to @agent-relay/harness-driver@8.2, rebranded primitives to @relayflows/*, and brought the core test suite to green.

  • Refactors

    • Swapped v7 broker API for HarnessDriverClient with a single onEvent(BrokerEvent); preserved runner behavior via a WorkflowAgentHandle adapter over SpawnedAgentHandle.
    • Vendored CLI helpers (getCliDefinition, resolveCliSync, resolveSpawnPolicy) into core/src/cli-registry.ts; switched to strip-ansi. Updated integrations to @relayflows/* and publish order to build primitives first.
    • Minor cleanup: ignore **/.agent-relay/ in VCS and validate script file extensions before existence checks.
  • Tests

    • Repaired and migrated the suite to the new driver: mock HarnessDriverClient, use the runner’s executor seam to avoid real spawns, and reconcile to { reason } results and fail-fast behavior. Full suite passing.
    • Added vitest configs with a plugin to externalize the node:sqlite builtin and capped fork concurrency; updated trajectory path expectations and permission-audit path via @agent-relay/cloud’s getDefaultPermissionAuditPath.

Written for commit 4c32182. Summary will update on new commits.

Review in cubic


CodeAnt-AI Description

Move core to the v8 agent runtime and rename the primitives to @relayflows

What Changed

  • Core now uses the v8 harness driver for agent spawning, lifecycle tracking, idle nudges, and broker events.
  • The workflow runner keeps the same user-facing behavior while handling the new agent handle format behind the scenes.
  • GitHub, Slack, and Browser primitives were renamed to the @relayflows/* package scope, and the publish workflow now builds them before core.
  • Workflow tests were updated to match the new runtime behavior, including fail-fast cases, review decisions, idle nudges, and trajectory file locations.

Impact

✅ Workflows run against the v8 agent runtime
✅ Fewer breakages from renamed packages
✅ Clearer workflow test coverage for retry, review, and idle behavior

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Brings packages/core to a compiling, building state for the first time in
this repo (it was imported mid-migration and never built here), and rebrands
the primitives to the @relayflows scope.

Primitives:
- Rename @agent-relay/{github,slack,browser}-primitive -> @relayflows/*,
  aligned to the 0.1.0 lineage; update core deps, imports, README, and the
  publish workflow's release notes + build order.

Core SDK migration (v7 -> v8):
- Swap the broker surface from the v7 `AgentRelay` (removed API) to
  `@agent-relay/harness-driver`'s `HarnessDriverClient`: construction via
  `HarnessDriverClient.spawn({ onStderr })`, the 7 legacy named listeners
  rewritten as a single `onEvent(BrokerEvent)` dispatch, `spawnPty`,
  `listAgents`/`release`, `shutdown`, and idle-nudge via `sendMessage`.
- Consume the new `SpawnedAgentHandle` (waitForExit/waitForIdle/exitCode,
  added upstream in harness-driver 8.2) through a thin local
  WorkflowAgentHandle wrapper that preserves the runner's string contract.
- Vendor the deleted CLI utilities (getCliDefinition/resolveCliSync/
  resolveSpawnPolicy) into core/src/cli-registry.ts; stripAnsi -> strip-ansi.
- Bump @agent-relay/{cloud,config,sdk,harness-driver,harnesses} to ^8.2.0 and
  unify the repo on a single SDK major (drop browser-primitive's vestigial
  7.1.1 pin).

core + cli typecheck clean and build.

Tests (in progress): repaired monorepo-era import paths (../workflows/* and
../../provisioner/*), fixed the builtin-templates path (yaml-validation:
0 -> 143), and migrated workflow-runner.test.ts to the HarnessDriverClient
mock with fail-fast reconciliation (now 59/65). Remaining broker-mock files
and supervised-flow behavioral reconciliation are still to do.

See docs/sdk-v8-migration-plan.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR migrates relayflows from @agent-relay/sdk v7 to v8, rebranding primitive packages to @relayflows/*, introducing a WorkflowAgentHandle adapter, vendoring CLI utilities locally, and refactoring the core runner for harness-driver broker integration with comprehensive test rewiring.

Changes

Core SDK v8 migration and architecture update

Layer / File(s) Summary
Package metadata updates and dependency alignment
packages/browser-primitive/package.json, packages/core/package.json, packages/github-primitive/package.json, packages/slack-primitive/package.json
Primitive packages renamed to @relayflows/* at v0.1.0 (from @agent-relay/* v7.1.1); core dependencies upgraded to @agent-relay/* v8.2.0 with new @agent-relay/harness-driver addition.
WorkflowAgentHandle adapter for harness-driver compatibility
packages/core/src/agent-handle.ts
New WorkflowAgentHandle wrapper adapts v8 SpawnedAgentHandle lifecycle results ({ reason } objects) to runner's string contract ('exited' | 'idle' | 'timeout').
CLI utility vendoring into local registry
packages/core/src/cli-registry.ts, packages/core/src/process-spawner.ts, packages/core/src/proxy-env.ts, packages/core/src/channel-messenger.ts
Consolidated cli-registry.ts provides CLI definitions, resolution (async/sync with memoization), and spawn-policy construction; updated consumers to import locally instead of from SDK; process-spawner gains onChunk callback and early-completion handling.
Builder and run option type updates
packages/core/src/builder.ts, packages/core/src/run.ts
WorkflowRunOptions.relay retyped from AgentRelayOptions to RuntimeSpawnOptions to align with v8 broker API.
Core runner broker and agent lifecycle migration
packages/core/src/runner.ts
Major refactor from AgentRelay listener hooks to unified HarnessDriverClient.onEvent dispatcher; spawnPty constructs WorkflowAgentHandle; stale-agent cleanup via .release(name); improved owner-timeout precedence; evidence-based review decision parsing; idle nudging via .sendMessage().
Test infrastructure and mock rewiring
packages/core/src/__tests__/*.test.ts
Harness-driver mocking with makeMockHandle() PTY helpers; centralized eventListeners registry bridging legacy event names to broker event shapes; test paths aligned to new module structure; event isolation via beforeEach cleanup; many tests set errorHandling.strategy: 'fail-fast' for deterministic behavior.
Integration imports update to new primitive scopes
packages/core/src/integrations/browser.ts, packages/core/src/integrations/github.ts, packages/core/src/integrations/slack.ts
Browser, GitHub, and Slack integrations updated to import from @relayflows/* scoped packages.
Documentation, workflow, and configuration updates
.github/workflows/publish.yml, .gitignore, docs/sdk-v8-migration-plan.md, packages/core/vitest.config.ts, packages/github-primitive/README.md
Migration plan added with workstreams (WS-0 to WS-5) and effort estimate; release workflow reordered to build primitives first and release notes updated; .gitignore extended for npm cache and runtime directories; vitest configuration and node:sqlite plugin added; GitHub primitive README updated.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

"🐰 In SDK eight, our runner takes flight,
With harness drivers shining bright,
Primitives rebranded with relayflows' name,
Agent handles adapted—same dance, different frame!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.90% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: build & test relayflows core against @agent-relay 8.x' accurately reflects the main objective of migrating core to v8 and establishing a compiling/testable state.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description comprehensively describes the migration from v7 @agent-relay/sdk to v8 @agent-relay/harness-driver, rebranding of primitives, vendor changes, and test updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/sdk-v8-migration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@willwashburn willwashburn marked this pull request as ready for review June 4, 2026 16:48
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dda4aafc48

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +6739 to +6744
this.log(`[${step.name}] Spawning ${agentDef.cli} (pty)`);
agent = new WorkflowAgentHandle(
await this.relay.spawnPty({
...(spawnOptions as Record<string, unknown>),
cli: agentDef.cli,
} as Parameters<AgentRelay['spawnPty']>[0]);
}
} as SpawnPtyInput)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve headless spawning for OpenCode agents

When agentDef.cli is opencode, this path now always calls spawnPty and even logs (pty), whereas the previous SDK-spawner path requested the OpenCode headless runtime. I checked the Agent Relay TypeScript SDK reference, which says to use runtime: 'headless' for app-server sessions such as OpenCode; with this change any workflow using an OpenCode agent is launched through the PTY transport instead of the supported headless transport, so those steps can fail or hang rather than execute normally.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/core/src/runner.ts (2)

6789-6803: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Hub nudge lookup is using logical names against a runtime-name map.

activeAgentHandles is populated with spawned runtime names, but resolveHubForNudge() probes it with agentDef.name from config. In supervised flows those names differ (<step>-owner-<runId>, etc.), so hub-mediated nudges never find the live coordinator and always fall back to the direct path. Resolve this via the logical-name metadata in runtimeStepAgents/supervisedRuntimeAgents, or search the live handles by logical agent identity instead of map key.

Also applies to: 6845-6845, 7219-7243

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/runner.ts` around lines 6789 - 6803, The hub nudge lookup
fails because resolveHubForNudge() probes activeAgentHandles by config
agentDef.name (logical name) while activeAgentHandles keys are spawned runtime
names; update resolveHubForNudge (and its call sites around
resolveSignalParticipantKind/rememberStepSignalSender) to resolve runtime handle
by using the logical-name metadata stored in runtimeStepAgents and
supervisedRuntimeAgents (or fall back to scanning activeAgentHandles values for
matching logicalName/options.logicalName) and then use that runtime name to
query activeAgentHandles; ensure calls that currently pass agentDef.name pass
the logical identifier and the resolver performs the reverse lookup to return
the correct runtime handle for hub-mediated nudges.

6070-6076: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only finalize review early on an explicit REVIEW_DECISION line.

onChunk() now runs the full heuristic parser on partial output and immediately releases the reviewer handle when it sees a match. That makes the new evidence fallback fail-open: a chunk like looks good so far... or an echoed safe handoff phrase can be locked in as approved before the final verdict arrives. Keep the streaming path strict (REVIEW_DECISION: only), and apply the evidence/indecision fallbacks only after the reviewer exits.

Suggested direction
         onChunk: ({ chunk }) => {
           const nextOutput = reviewOutput + WorkflowRunner.stripAnsi(chunk);
           reviewOutput = nextOutput;
-          const parsed = this.parseReviewDecision(nextOutput);
+          const parsed = this.parseStrictReviewDecision(nextOutput);
           if (parsed) {
             startReviewCompletion(parsed);
           }
         },

Also applies to: 6121-6129, 6279-6313

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/runner.ts` around lines 6070 - 6076, The stream handler
currently runs the full heuristic parser on every partial chunk (in onChunk) and
may call startReviewCompletion too early; change onChunk so it only triggers
parseReviewDecision/startReviewCompletion when the chunk contains an explicit
REVIEW_DECISION: token (e.g., test for the literal prefix or regex like
/^\s*REVIEW_DECISION:/ before calling this.parseReviewDecision), leaving
heuristic/evidence and indecision fallbacks to run only after the reviewer
process exits; update the analogous handlers that call
this.parseReviewDecision/startReviewCompletion (the other occurrences noted near
the blocks at 6121-6129 and 6279-6313) to follow the same rule.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/sdk-v8-migration-plan.md`:
- Around line 94-101: The docs sections (WS-1/WS-4) still describe the old
AgentRelay/messaging APIs and conflict with the implemented runner direction;
update these sections to either (a) replace all references to AgentRelay option
changes with the actual HarnessDriverClient lifecycle APIs (and update mentions
in builder.ts, run.ts, runner.ts references) or (b) clearly mark the AgentRelay
text as a historical draft and add a short “historical vs final approach” note
that points to HarnessDriverClient; specifically revise the wording around
AgentRelay, AgentRelayOptions, and where env/brokerName/requestTimeoutMs map
(now handled via HarnessDriverClient/SpawnRuntimeInput, broker transport config,
and retryPolicy) and ensure the cross-references in builder.ts, run.ts, and
runner.ts are consistent with the final approach.
- Around line 41-43: The event-surface guidance is wrong: update the table
entries that show session-style named listeners (e.g., the
`addListener('workerOutput'|'messageReceived'|...)` row) to state that direct
clients consume a unified BrokerEvent via `onEvent(BrokerEvent)` instead of
remapped session listeners; also clarify that `addListener<K extends keyof
HarnessDriverEvents>(event, handler)` remains for driver-side hooks but is not
the client-facing model, and change the `onBrokerStderr(cb)` note to reflect the
`onStderr` construction option; apply the same corrections to the duplicated
section later in the doc where the session listener mapping is repeated.
- Around line 3-5: Update the migration plan's metadata to reflect the merged
v8.2 state: change the status text on the current "Status: proposed, not
started." line to indicate the migration is complete (e.g., "Status: merged /
completed") and update all version references currently showing "8.0.4" and
"8.1.2" to the current dependency range "^8.2.0"; apply the same edits to the
other occurrences referenced (including the block around lines 167-181) so the
document consistently reflects core being migrated and relay deps at ^8.2.0.

In `@packages/core/src/__tests__/resume-fallback.test.ts`:
- Around line 118-119: The test currently mocks legacy ../relay.js but
WorkflowRunner initializes this.relay via HarnessDriverClient.spawn (so broker
events use relay.onEvent/connectEvents) and requires a broker for agent steps
during runner.resume; replace the vi.mock('../relay.js', ...) with
vi.mock('`@agent-relay/harness-driver`', ...) and provide a mock object matching
the HarnessDriverClient surface used by the runner: export a spawn that returns
an object with onEvent, connectEvents, spawnPty (and any minimal behavior needed
for tests), ensuring WorkflowRunner (created by importing ../runner.js) gets
this mocked HarnessDriverClient so requiresBroker and relay event dispatch work
in the test.

In `@packages/core/src/__tests__/start-from.test.ts`:
- Around line 121-122: The test currently mocks the legacy relay API but not the
new harness-driver contract used by WorkflowRunner; update start-from.test.ts to
mock '`@agent-relay/harness-driver`' so that HarnessDriverClient.spawn(...) is
replaced with a test stub that returns an object matching the broker used by
WorkflowRunner (i.e., implement spawn to return an object with an onEvent method
and any other methods WorkflowRunner expects), remove or stop relying on legacy
../relay.js/addListener mocks, and ensure the test triggers relay events by
calling the stubbed broker.onEvent handlers so the start-from path exercises the
real broker lifecycle APIs used by WorkflowRunner and this.relay.onEvent.

In `@packages/core/src/__tests__/workflow-runner.test.ts`:
- Around line 72-84: The tests mock for spawnPty currently returns bare strings
from waitForExit/waitForIdle but WorkflowAgentHandle expects an object shape {
reason }; update the mock so waitForExit and waitForIdle resolve to an object
with the reason field (e.g., { reason: <value> }) — adjust the spawnPty mock or
replace its return with the existing makeMockHandle(name) factory (which exposes
waitForExit/waitForIdle that resolve to { reason }) or wrap
waitForExitFn/waitForIdleFn results before returning so all callers receive {
reason } instead of a raw string.

In `@packages/core/src/runner.ts`:
- Around line 4342-4346: The getFailureResult() branches are unconditionally
printing error.stack when the message matches /reading 'get'/ which leaks
stacks; remove the console.error('DEBUG_STACK>>>', error.stack) calls inside
those inline error serialization blocks (the getFailureResult implementations)
and instead rely on the existing RF_DEBUG_STACK gated debug logging used in the
surrounding catch handlers; if you need to preserve stack output here, guard it
with the same env check (process.env.RF_DEBUG_STACK) or call the centralized
debug logger rather than printing unconditionally. Also remove the identical
unconditional stack dump in the second failure-serialization branch referenced
in the comment (the other getFailureResult-like block) so both locations use the
env-gated debug path.

---

Outside diff comments:
In `@packages/core/src/runner.ts`:
- Around line 6789-6803: The hub nudge lookup fails because resolveHubForNudge()
probes activeAgentHandles by config agentDef.name (logical name) while
activeAgentHandles keys are spawned runtime names; update resolveHubForNudge
(and its call sites around
resolveSignalParticipantKind/rememberStepSignalSender) to resolve runtime handle
by using the logical-name metadata stored in runtimeStepAgents and
supervisedRuntimeAgents (or fall back to scanning activeAgentHandles values for
matching logicalName/options.logicalName) and then use that runtime name to
query activeAgentHandles; ensure calls that currently pass agentDef.name pass
the logical identifier and the resolver performs the reverse lookup to return
the correct runtime handle for hub-mediated nudges.
- Around line 6070-6076: The stream handler currently runs the full heuristic
parser on every partial chunk (in onChunk) and may call startReviewCompletion
too early; change onChunk so it only triggers
parseReviewDecision/startReviewCompletion when the chunk contains an explicit
REVIEW_DECISION: token (e.g., test for the literal prefix or regex like
/^\s*REVIEW_DECISION:/ before calling this.parseReviewDecision), leaving
heuristic/evidence and indecision fallbacks to run only after the reviewer
process exits; update the analogous handlers that call
this.parseReviewDecision/startReviewCompletion (the other occurrences noted near
the blocks at 6121-6129 and 6279-6313) to follow the same rule.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: de9e1424-4786-4a36-bca2-33414ec0b439

📥 Commits

Reviewing files that changed from the base of the PR and between ab6bab3 and dda4aaf.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (31)
  • .github/workflows/publish.yml
  • .gitignore
  • docs/sdk-v8-migration-plan.md
  • packages/browser-primitive/package.json
  • packages/core/package.json
  • packages/core/src/__tests__/completion-pipeline.test.ts
  • packages/core/src/__tests__/e2e-owner-review.test.ts
  • packages/core/src/__tests__/e2e-permissions.test.ts
  • packages/core/src/__tests__/error-scenarios.test.ts
  • packages/core/src/__tests__/idle-nudge.test.ts
  • packages/core/src/__tests__/permission-types.test.ts
  • packages/core/src/__tests__/permissions-integration.test.ts
  • packages/core/src/__tests__/resume-fallback.test.ts
  • packages/core/src/__tests__/start-from.test.ts
  • packages/core/src/__tests__/workflow-runner.test.ts
  • packages/core/src/__tests__/yaml-validation.test.ts
  • packages/core/src/agent-handle.ts
  • packages/core/src/builder.ts
  • packages/core/src/channel-messenger.ts
  • packages/core/src/cli-registry.ts
  • packages/core/src/integrations/browser.ts
  • packages/core/src/integrations/github.ts
  • packages/core/src/integrations/slack.ts
  • packages/core/src/process-spawner.ts
  • packages/core/src/proxy-env.ts
  • packages/core/src/run.ts
  • packages/core/src/runner.ts
  • packages/core/vitest.config.ts
  • packages/github-primitive/README.md
  • packages/github-primitive/package.json
  • packages/slack-primitive/package.json

Comment on lines +3 to +5
> Status: **proposed, not started.** This plan was produced from a read-only
> investigation of `packages/core` and the relay monorepo HEAD (all `@agent-relay/*`
> at `8.0.4` source / `8.1.2` npm). Review before any code is written.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update plan status/version references to reflect the merged v8.2 migration state.

Line 3 (“proposed, not started”) and Line 5 (8.0.4/8.1.2) now conflict with the current repo state (core already migrated and dependencies at ^8.2.0). This makes the plan misleading for follow-up work and backport decisions.

Also applies to: 167-181

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/sdk-v8-migration-plan.md` around lines 3 - 5, Update the migration
plan's metadata to reflect the merged v8.2 state: change the status text on the
current "Status: proposed, not started." line to indicate the migration is
complete (e.g., "Status: merged / completed") and update all version references
currently showing "8.0.4" and "8.1.2" to the current dependency range "^8.2.0";
apply the same edits to the other occurrences referenced (including the block
around lines 167-181) so the document consistently reflects core being migrated
and relay deps at ^8.2.0.

Comment on lines +41 to +43
| `addListener('workerOutput'\|'messageReceived'\|'agentSpawned'\|'agentReleased'\|'agentExited'\|'agentIdle'\|'deliveryUpdate', …)` | same `addListener<K extends keyof HarnessDriverEvents>(event, handler)` — **same event names** | ✅ verbatim |
| `workerOutput {name, chunk}` / `messageReceived {eventId,from,to,text,threadId}` / `agentIdle {name,idleSecs}` payloads | `WorkerOutputPayload` / `DriverMessage` / `AgentIdlePayload` — **same field names** | ✅ verbatim |
| `onBrokerStderr(cb)` | `onStderr` **construction option** | ✅ moved to ctor |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the event-surface guidance: direct clients consume BrokerEvent via onEvent, not the named/session listener mapping shown here.

The current runner integration is built around onEvent(BrokerEvent) dispatch (not the session-style message.created/terminal.output remap in this table). Please update this section to avoid steering future migrations toward a non-working listener model.

Also applies to: 121-133

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/sdk-v8-migration-plan.md` around lines 41 - 43, The event-surface
guidance is wrong: update the table entries that show session-style named
listeners (e.g., the `addListener('workerOutput'|'messageReceived'|...)` row) to
state that direct clients consume a unified BrokerEvent via
`onEvent(BrokerEvent)` instead of remapped session listeners; also clarify that
`addListener<K extends keyof HarnessDriverEvents>(event, handler)` remains for
driver-side hooks but is not the client-facing model, and change the
`onBrokerStderr(cb)` note to reflect the `onStderr` construction option; apply
the same corrections to the duplicated section later in the doc where the
session listener mapping is repeated.

Comment on lines +94 to +101
### WS-1 — AgentRelay construction & options (`runner.ts:3145`, types in `builder.ts:4`, `run.ts:1`)
- v7 `new AgentRelay({ brokerName, channels, env, requestTimeoutMs })` →
v8 `new AgentRelay({ workspaceKey, baseUrl, retryPolicy, harness })`.
- `env` / `brokerName` / `requestTimeoutMs` are gone from options. Decide where
each goes: `env` → harness-driver `SpawnRuntimeInput`; `brokerName` → broker
transport config; `requestTimeoutMs` → `retryPolicy`.
- Update the `AgentRelayOptions` type references in `builder.ts`, `run.ts`,
`runner.ts:481`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

WS-1/WS-4 still describe superseded AgentRelay/messaging APIs; align these steps to HarnessDriverClient lifecycle APIs or mark them as historical.

As written, these sections conflict with the implemented runner direction and can cause incorrect follow-on edits in runner.ts/tests. A brief “historical draft vs final approach” note would prevent confusion.

Also applies to: 140-144

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/sdk-v8-migration-plan.md` around lines 94 - 101, The docs sections
(WS-1/WS-4) still describe the old AgentRelay/messaging APIs and conflict with
the implemented runner direction; update these sections to either (a) replace
all references to AgentRelay option changes with the actual HarnessDriverClient
lifecycle APIs (and update mentions in builder.ts, run.ts, runner.ts references)
or (b) clearly mark the AgentRelay text as a historical draft and add a short
“historical vs final approach” note that points to HarnessDriverClient;
specifically revise the wording around AgentRelay, AgentRelayOptions, and where
env/brokerName/requestTimeoutMs map (now handled via
HarnessDriverClient/SpawnRuntimeInput, broker transport config, and retryPolicy)
and ensure the cross-references in builder.ts, run.ts, and runner.ts are
consistent with the final approach.

Comment thread packages/core/src/__tests__/resume-fallback.test.ts Outdated
Comment on lines +121 to +122
const { WorkflowRunner } = await import('../runner.js');
const { workflow } = await import('../builder.js');

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check runtime contract vs current test mock style.
rg -n "HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts -C2
rg -n "vi\\.mock\\('../relay\\.js'|addListener\\(|vi\\.mock\\('`@agent-relay/harness-driver`'|onEvent\\()" \
  packages/core/src/__tests__/start-from.test.ts -C2

Repository: AgentWorkforce/relayflows

Length of output: 1456


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Confirm runner.ts uses harness-driver client + onEvent broker events
rg -n "import \\{ HarnessDriverClient \\} from '`@agent-relay/harness-driver`'|\\.onEvent\\(" packages/core/src/runner.ts -C2

# 2) Inspect start-from.test.ts around the shown import lines
sed -n '90,170p' packages/core/src/__tests__/start-from.test.ts | nl -ba | sed -n '1,120p'

# 3) Find mocks/doubles in start-from.test.ts for relay.js/addListener/harness-driver usage
rg -n "vi\\.mock\\(|\\baddListener\\b|`@agent-relay/harness-driver`|onEvent\\(" packages/core/src/__tests__/start-from.test.ts -C2

Repository: AgentWorkforce/relayflows

Length of output: 802


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="packages/core/src/__tests__/start-from.test.ts"

# 1) Show the exact area around the reported dynamic imports (approx.)
sed -n '105,145p' "$FILE" | cat -n

# 2) Look for relay.js / addListener mocks in the test (separately, to avoid regex errors)
rg -n "vi\\.mock" "$FILE"
rg -n "addListener" "$FILE"
rg -n "relay\\.js" "$FILE"
rg -n "`@agent-relay/harness-driver`" "$FILE"

# 3) Confirm runner mock usage style via onEvent / HarnessDriverClient references in the test
rg -n "onEvent\\(" "$FILE"
rg -n "HarnessDriverClient" "$FILE"

Repository: AgentWorkforce/relayflows

Length of output: 1924


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect runner.ts dependencies/migration status
rg -n "from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js\"|\\bAgentRelay\\b|addListener" packages/core/src/runner.ts
rg -n "HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts

# Inspect the start-from test mocks and whether it references onEvent/harness-driver
FILE="packages/core/src/__tests__/start-from.test.ts"
rg -n "HarnessDriverClient|`@agent-relay/harness-driver`|onEvent\\(" "$FILE"
sed -n '60,140p' "$FILE" | cat -n
sed -n '140,240p' "$FILE" | cat -n

Repository: AgentWorkforce/relayflows

Length of output: 212


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# runner.ts: look for legacy relay.js/AgentRelay/addListener references
rg -n "relay\\.js" packages/core/src/runner.ts
rg -n "AgentRelay" packages/core/src/runner.ts
rg -n "addListener" packages/core/src/runner.ts

# runner.ts: harness-driver contract references
rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts

# start-from.test.ts: harness-driver/onEvent references
FILE="packages/core/src/__tests__/start-from.test.ts"
rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|onEvent\\(" "$FILE"

Repository: AgentWorkforce/relayflows

Length of output: 51


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="packages/core/src/runner.ts"
# Show runner.ts imports area (first ~80 lines)
sed -n '1,120p' "$FILE" | cat -n

# Search runner.ts for relay.js / relay identifiers in simpler way
rg -n "relay" "$FILE"
rg -n "from '\\./relay'|from '\\./relay\\.js'|from \"\\./relay\"|from \"\\./relay\\.js\"|relay\\.js" "$FILE" || true

# Find where WorkflowRunner is exported/defined
rg -n "export (class|function|const) WorkflowRunner|class WorkflowRunner|WorkflowRunner" "$FILE" | head -n 50

Repository: AgentWorkforce/relayflows

Length of output: 14764


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) See if runner.ts imports the relay module that the test mocks (../relay.js => src/relay.js)
FILE="packages/core/src/runner.ts"
rg -n "from '\\.\\/relay(\\.|\\/|')|from \"\\.\\/relay(\\.|\\/|')|from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js'|from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js\"|relay\\.js" "$FILE" || true
rg -n "\\baddListener\\b|listAgentsRaw|AgentRelay|AgentRelayClient" "$FILE" || true

# 2) Inspect the test's relay mock implementation to understand what contract it provides
TEST="packages/core/src/__tests__/start-from.test.ts"
rg -n "const mockRelayInstance|vi\\.mock\\('\\.\\./relay\\.js'\\)|vi\\.mock\\(\"\\.\\./relay\\.js\"\\)|addListener|listAgentsRaw|emitRelayEvent" "$TEST"
sed -n '1,140p' "$TEST" | cat -n

# 3) Inspect the mocked relay module itself (src/relay.*) for whether it still exposes addListener
# Locate relay files
find packages/core/src -maxdepth 1 -type f -name "relay.*" -o -name "*relay*" | head -n 50

Repository: AgentWorkforce/relayflows

Length of output: 6426


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find any import/require of relay layer from runner.ts
FILE="packages/core/src/runner.ts"
rg -n "from '\\./relay|from \"\\./relay|require\\(|\\./relay\\.js|relay\\.js" "$FILE" || true
rg -n "import\\s+\\{[^}]*\\}\\s+from\\s+'\\./relay" "$FILE" || true
rg -n "AgentRelay" "$FILE" || true
rg -n "addListener" "$FILE" || true

# 2) Inspect runner.ts region where HarnessDriverClient.spawn + onEvent are used
rg -n "HarnessDriverClient\\.spawn\\(|this\\.relay\\.onEvent\\(" "$FILE"
# show around those lines
start=3100
end=3180
sed -n "${start},${end}p" "$FILE" | cat -n

# 3) Also check runner.ts for any "relay.js" direct usage
rg -n "new\\s+AgentRelay|AgentRelay\\(" "$FILE" || true

Repository: AgentWorkforce/relayflows

Length of output: 5105


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="packages/core/src/__tests__/start-from.test.ts"

# Find where workflow() is defined/used
rg -n "workflow\\(" "$FILE"
rg -n "new WorkflowRunner|WorkflowRunner\\b" "$FILE"
rg -n "startFrom|startFromStep|start-from|startFrom" "$FILE"
rg -n "AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST|requiresBroker|relaycastDisabled" "$FILE"

# Show larger windows around the first workflow() usage and runner invocation.
# (Use sed ranges based on first matches.)
FIRST_WORKFLOW_LINE=$(rg -n "workflow\\(" "$FILE" | head -n 1 | cut -d: -f1)
FIRST_RUNNER_LINE=$(rg -n "new WorkflowRunner|WorkflowRunner\\b" "$FILE" | head -n 1 | cut -d: -f1 || true)

echo "First workflow line: ${FIRST_WORKFLOW_LINE:-N/A}"
echo "First runner line: ${FIRST_RUNNER_LINE:-N/A}"

if [ -n "${FIRST_WORKFLOW_LINE:-}" ]; then
  start=$((FIRST_WORKFLOW_LINE-60)); end=$((FIRST_WORKFLOW_LINE+220))
  sed -n "${start},${end}p" "$FILE" | cat -n
fi

# Also show where tests configure step types (search for "type:" in workflow definition)
rg -n "type:\\s*'|type:\\s*\"" "$FILE" | head -n 80

Repository: AgentWorkforce/relayflows

Length of output: 1655


🏁 Script executed:

#!/bin/bash
set -euo pipefail

TEST="packages/core/src/__tests__/start-from.test.ts"
RUNNER="packages/core/src/runner.ts"

# Show workflow config definition around the reported line (~74) and the execute calls around runner.execute.
sed -n '140,240p' "$TEST" | cat -n
sed -n '240,380p' "$TEST" | cat -n

# Locate the "workflow('test')" config and show its content.
# (First show ~100 lines around that occurrence.)
LINE=$(rg -n "const config = workflow\\(" "$TEST" | head -n 1 | cut -d: -f1)
start=$((LINE-80)); end=$((LINE+200))
sed -n "${start},${end}p" "$TEST" | cat -n

# Inspect runner.ts for how `this.executor` is set/used near requiresBroker.
rg -n "this\\.executor" "$RUNNER"
# Show the block around the requiresBroker computation we already saw (~3100-3180), plus earlier few lines.
sed -n '3000,3125p' "$RUNNER" | cat -n
sed -n '3125,3205p' "$RUNNER" | cat -n

Repository: AgentWorkforce/relayflows

Length of output: 27303


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Confirm whether start-from.test.ts mocks harness-driver at all
FILE="packages/core/src/__tests__/start-from.test.ts"
rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|spawnPty\\(|\\.onEvent\\(" "$FILE" || true

# 2) Inspect how workflow steps are typed when config steps omit explicit `type`
#    Search for step parsing/typing logic around `deterministic`/`worktree`/`integration`
rg -n "deterministic'|worktree'|integration'|step\\.type|type:\\s*['\"]deterministic['\"]" packages/core/src | head -n 80

# Narrow: search for where step objects are normalized/validated
rg -n "normalize.*step|resolve.*step|validate.*step|infer.*step|apply.*step" packages/core/src | head -n 80

# 3) Find where RelayYamlConfig.workflows.steps are converted into internal WorkflowStep objects with `type`
rg -n "workflows\\b|steps\\b.*type|type.*step" packages/core/src | head -n 80

Repository: AgentWorkforce/relayflows

Length of output: 24662


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where harness-driver is mocked anywhere in core tests
rg -n "vi\\.mock\\(['\"]`@agent-relay/harness-driver`['\"]\\)" packages/core/src/__tests__ || true
rg -n "setup.*vitest|vitest\\.setup|setupFiles|testSetup|beforeAll\\(" packages/core/src/__tests__ packages/core/src || true

# Inspect runner.ts around where it throws AgentRelay not initialized and any other relay.js usage
rg -n "AgentRelay not initialized|AgentRelay" packages/core/src/runner.ts
sed -n '6560,6725p' packages/core/src/runner.ts | cat -n

# Inspect relay.js module itself for any legacy addListener API
RELAY="packages/core/src/relay.ts"
if [ -f "$RELAY" ]; then
  sed -n '1,220p' "$RELAY" | cat -n
else
  RELAYJS="packages/core/src/relay.js"
  if [ -f "$RELAYJS" ]; then
    sed -n '1,220p' "$RELAYJS" | cat -n
  fi
fi

Repository: AgentWorkforce/relayflows

Length of output: 7928


Fix start-from.test.ts relay mocking to match WorkflowRunner’s harness-driver contract

The test still mocks legacy ../relay.js/addListener behavior, but WorkflowRunner now creates the broker with HarnessDriverClient.spawn(...) and consumes broker events via this.relay.onEvent(...). There is no vi.mock('@agent-relay/harness-driver') in this test, so the start-from path isn’t exercising the same broker lifecycle APIs as production.

const { WorkflowRunner } = await import('../runner.js');
const { workflow } = await import('../builder.js');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/__tests__/start-from.test.ts` around lines 121 - 122, The
test currently mocks the legacy relay API but not the new harness-driver
contract used by WorkflowRunner; update start-from.test.ts to mock
'`@agent-relay/harness-driver`' so that HarnessDriverClient.spawn(...) is replaced
with a test stub that returns an object matching the broker used by
WorkflowRunner (i.e., implement spawn to return an object with an onEvent method
and any other methods WorkflowRunner expects), remove or stop relying on legacy
../relay.js/addListener mocks, and ensure the test triggers relay events by
calling the stubbed broker.onEvent handlers so the start-from path exercises the
real broker lifecycle APIs used by WorkflowRunner and this.relay.onEvent.

Comment thread packages/core/src/__tests__/workflow-runner.test.ts
Comment on lines 4342 to 4346
getFailureResult: (error) => ({
status: 'failed',
output: '',
error: error instanceof Error ? error.message : String(error),
error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(),
retries: state.row.retryCount,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the unconditional stack dump from failure serialization.

These two getFailureResult() branches print error.stack whenever the message matches reading 'get', regardless of RF_DEBUG_STACK. That leaks internal traces into normal runs and makes logging depend on error text. Reuse the existing env-gated debug logging in the surrounding catch paths instead.

Also applies to: 4408-4411

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/runner.ts` around lines 4342 - 4346, The getFailureResult()
branches are unconditionally printing error.stack when the message matches
/reading 'get'/ which leaks stacks; remove the console.error('DEBUG_STACK>>>',
error.stack) calls inside those inline error serialization blocks (the
getFailureResult implementations) and instead rely on the existing
RF_DEBUG_STACK gated debug logging used in the surrounding catch handlers; if
you need to preserve stack output here, guard it with the same env check
(process.env.RF_DEBUG_STACK) or call the centralized debug logger rather than
printing unconditionally. Also remove the identical unconditional stack dump in
the second failure-serialization branch referenced in the comment (the other
getFailureResult-like block) so both locations use the env-gated debug path.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 32 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/src/runner.ts">

<violation number="1" location="packages/core/src/runner.ts:4345">
P2: Leftover debug instrumentation logs stack traces from failure paths; remove the inline `console.error` side effect from error serialization.</violation>
</file>

<file name="docs/sdk-v8-migration-plan.md">

<violation number="1" location="docs/sdk-v8-migration-plan.md:3">
P3: Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.</violation>

<violation number="2" location="docs/sdk-v8-migration-plan.md:94">
P3: This workstream section still documents a migration path around `AgentRelay`, but the implementation has already moved to `HarnessDriverClient`. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.</violation>

<violation number="3" location="docs/sdk-v8-migration-plan.md:126">
P2: WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.</violation>
</file>

Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Re-trigger cubic

status: 'failed',
output: '',
error: error instanceof Error ? error.message : String(error),
error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Leftover debug instrumentation logs stack traces from failure paths; remove the inline console.error side effect from error serialization.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/runner.ts, line 4345:

<comment>Leftover debug instrumentation logs stack traces from failure paths; remove the inline `console.error` side effect from error serialization.</comment>

<file context>
@@ -4397,7 +4342,7 @@ export class WorkflowRunner {
         status: 'failed',
         output: '',
-        error: error instanceof Error ? error.message : String(error),
+        error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(),
         retries: state.row.retryCount,
         exitCode: lastExitCode,
</file context>

@@ -0,0 +1,181 @@
# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 126:

<comment>WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.</comment>

<file context>
@@ -0,0 +1,181 @@
+
+| v7 listener (fields) | v8 source |
+|---|---|
+| `workerOutput` `{name, chunk}` (3158) | session event `terminal.output`/`transcript.chunk` → `event.text`/`chunk` |
+| `messageReceived` `{eventId,from,to,text,threadId}` (3206) | `addListener('message.created')` → `event.message.text`, `event.envelope.from/to`, `event.message.parentId` |
+| `agentSpawned` `{name,runtime}` (3254) | session `status.active` / spawn ack from driver runtime |
</file context>

Comment thread packages/core/src/__tests__/start-from.test.ts Outdated
Comment thread packages/core/src/__tests__/resume-fallback.test.ts Outdated
@@ -0,0 +1,181 @@
# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan

> Status: **proposed, not started.** This plan was produced from a read-only

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 3:

<comment>Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.</comment>

<file context>
@@ -0,0 +1,181 @@
+# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan
+
+> Status: **proposed, not started.** This plan was produced from a read-only
+> investigation of `packages/core` and the relay monorepo HEAD (all `@agent-relay/*`
+> at `8.0.4` source / `8.1.2` npm). Review before any code is written.
</file context>
Suggested change
> Status: **proposed, not started.** This plan was produced from a read-only
> Status: **in progress.** WS-0 (utility vendoring) complete; remaining workstreams WS-1 through WS-5 still to be sequenced.

@@ -0,0 +1,181 @@
# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: This workstream section still documents a migration path around AgentRelay, but the implementation has already moved to HarnessDriverClient. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 94:

<comment>This workstream section still documents a migration path around `AgentRelay`, but the implementation has already moved to `HarnessDriverClient`. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.</comment>

<file context>
@@ -0,0 +1,181 @@
+- Move `BrokerEvent` / `AgentSpawner` type imports to `@agent-relay/harness-driver`
+  (`runner.ts:29,126`).
+
+### WS-1 — AgentRelay construction & options (`runner.ts:3145`, types in `builder.ts:4`, `run.ts:1`)
+- v7 `new AgentRelay({ brokerName, channels, env, requestTimeoutMs })` →
+  v8 `new AgentRelay({ workspaceKey, baseUrl, retryPolicy, harness })`.
</file context>

Reconcile the @relayflows/core test suite to the migrated runner and the
bumped @agent-relay 8.x dependencies. Treats the runner/SDK as source of
truth and updates test expectations to match.

Test infrastructure:
- Cap the vitest fork pool (maxForks: 4) so workflow tests that spawn child
  processes can't overwhelm the host.
- Add a Vite plugin to externalize the `node:sqlite` builtin, which Vite's
  resolver does not yet recognize (collectors load it at import time).

Eliminate real broker/PTY spawning in tests:
- start-from / resume-fallback mocked stale module paths (`@relaycast/sdk`,
  `../relay.js`) that the migrated runner no longer imports, so execute()
  hit the real HarnessDriverClient and spawned live runtimes. Switch both to
  the runner's `executor` injection seam — no broker init, no spawn.
- error-scenarios: drop a dead `vi.mock('@agent-relay/sdk/relay')` whose
  subpath was removed in sdk v8 (it failed the whole suite at load); the
  covered tests only exercise validation/early-reject paths.

Reconcile to current runtime behavior:
- Handle the SpawnedAgentHandle `{ reason }` contract in inline custom mock
  handles (raw reason strings destructured to undefined → undetected
  timeouts / hangs).
- Opt failure-path supervised/owner-timeout tests into
  errorHandling.strategy:'fail-fast' so they exercise first-pass failures
  instead of the runner's auto-enabled repair/retry loop (was hanging to the
  30s timeout).
- agent-trajectories v0.6 moved the default data dir to
  `.agentworkforce/trajectories` (per-id subdirs); update trajectory readers.
- @agent-relay/cloud v8 moved the permission-audit path and now grants write
  on all non-denied files under readwrite access; update expectations and
  convert provisioner-audit from node:test to vitest.
- @relayfile/sdk v0.8 only removes mount points it created itself; let the
  SDK own the mount so stop() exercises cleanup.

run-script.ts: validate file extension before existence so an unsupported
file type is reported as such regardless of whether the path exists.

Full core suite: 769 passing. Typecheck and build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codeant-ai

codeant-ai Bot commented Jun 5, 2026

Copy link
Copy Markdown

CodeAnt AI is reviewing your PR.

@codeant-ai codeant-ai Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files label Jun 5, 2026
@codeant-ai

codeant-ai Bot commented Jun 5, 2026

Copy link
Copy Markdown

CodeAnt AI finished reviewing your PR.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/__tests__/idle-nudge.test.ts (1)

168-175: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add eventListeners.clear() to prevent test pollution.

The eventListeners Set is defined at module scope (Line 80) and accumulates callbacks via onEvent registration, but it's never cleared between tests. This can cause event listeners from one test to leak into subsequent tests.

Other test files with the same pattern (e2e-owner-review.test.ts, e2e-permissions.test.ts, permissions-integration.test.ts) all clear the Set in beforeEach.

🧹 Proposed fix
 beforeEach(() => {
   vi.clearAllMocks();
+  eventListeners.clear();
   db = makeDb();
   runner = new WorkflowRunner({ db, workspaceId: 'ws-test' });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/__tests__/idle-nudge.test.ts` around lines 168 - 175, The
module-level Set eventListeners is accumulating callbacks across tests; add a
call to eventListeners.clear() inside the beforeEach block in this test file so
the Set is reset between tests. Locate the beforeEach that creates db, runner
and mocks (uses waitForExitFn and waitForIdleFn) and insert
eventListeners.clear() at the start of that beforeEach to prevent listener
leakage from prior tests (mirrors the pattern used in e2e-owner-review.test.ts
and others).
🧹 Nitpick comments (1)
packages/core/src/__tests__/verification-custom.test.ts (1)

196-200: ⚡ Quick win

Add eventListeners.clear() to prevent listener accumulation.

The beforeEach hook clears mocks but does not clear the eventListeners Set. Other test files in this PR (verification-traceback.test.ts line 317, workflow-runner.test.ts line 291) clear eventListeners in their beforeEach hooks. While this test may not actively emit events, clearing the set ensures consistency and prevents potential test pollution.

🧹 Suggested cleanup
  beforeEach(() => {
    vi.clearAllMocks();
    queuedSubprocessResults = [];
+   eventListeners.clear();
  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/__tests__/verification-custom.test.ts` around lines 196 -
200, The beforeEach hook in the 'custom verification' test clears mocks and
resets queuedSubprocessResults but does not clear the eventListeners Set, which
may cause listener accumulation across tests; update the beforeEach block to
call eventListeners.clear() (referencing the eventListeners Set and the
beforeEach function in this test) so event listeners are cleared before each
test run.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@packages/core/src/__tests__/idle-nudge.test.ts`:
- Around line 168-175: The module-level Set eventListeners is accumulating
callbacks across tests; add a call to eventListeners.clear() inside the
beforeEach block in this test file so the Set is reset between tests. Locate the
beforeEach that creates db, runner and mocks (uses waitForExitFn and
waitForIdleFn) and insert eventListeners.clear() at the start of that beforeEach
to prevent listener leakage from prior tests (mirrors the pattern used in
e2e-owner-review.test.ts and others).

---

Nitpick comments:
In `@packages/core/src/__tests__/verification-custom.test.ts`:
- Around line 196-200: The beforeEach hook in the 'custom verification' test
clears mocks and resets queuedSubprocessResults but does not clear the
eventListeners Set, which may cause listener accumulation across tests; update
the beforeEach block to call eventListeners.clear() (referencing the
eventListeners Set and the beforeEach function in this test) so event listeners
are cleared before each test run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b0e9649e-4533-41fa-ac86-046113eaa0ca

📥 Commits

Reviewing files that changed from the base of the PR and between dda4aaf and 4c32182.

📒 Files selected for processing (22)
  • packages/core/src/__tests__/budget-enforcement.test.ts
  • packages/core/src/__tests__/completion-pipeline.test.ts
  • packages/core/src/__tests__/e2big-and-verify.test.ts
  • packages/core/src/__tests__/e2e-owner-review.test.ts
  • packages/core/src/__tests__/e2e-permissions.test.ts
  • packages/core/src/__tests__/error-scenarios.test.ts
  • packages/core/src/__tests__/idle-nudge.test.ts
  • packages/core/src/__tests__/permissions-integration.test.ts
  • packages/core/src/__tests__/provisioner-audit.test.ts
  • packages/core/src/__tests__/provisioner-mount.test.ts
  • packages/core/src/__tests__/resume-fallback.test.ts
  • packages/core/src/__tests__/run-summary-table.test.ts
  • packages/core/src/__tests__/start-from.test.ts
  • packages/core/src/__tests__/step-cwd.test.ts
  • packages/core/src/__tests__/verification-custom.test.ts
  • packages/core/src/__tests__/verification-traceback.test.ts
  • packages/core/src/__tests__/workflow-runner.test.ts
  • packages/core/src/__tests__/workflow-trajectory.test.ts
  • packages/core/src/process-spawner.ts
  • packages/core/src/run-script.ts
  • packages/core/vitest.config.ts
  • vitest.config.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/core/src/tests/error-scenarios.test.ts
  • packages/core/src/tests/completion-pipeline.test.ts

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 22 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="vitest.config.ts">

<violation number="1" location="vitest.config.ts:19">
P3: `load('node:sqlite')` is dead code because `resolveId` always marks the module as external.</violation>
</file>

<file name="packages/core/src/process-spawner.ts">

<violation number="1" location="packages/core/src/process-spawner.ts:82">
P1: Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic


const notify = () => {
if (onChunk && !settled) {
onChunk(`${stdout.join('')}${stderr.join('')}`);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/process-spawner.ts, line 82:

<comment>Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.</comment>

<file context>
@@ -68,18 +68,29 @@ export function spawnProcess(command: string[], options: SpawnOptions): ChildPro
 
+    const notify = () => {
+      if (onChunk && !settled) {
+        onChunk(`${stdout.join('')}${stderr.join('')}`);
+      }
+    };
</file context>

Comment thread vitest.config.ts
enforce: 'pre',
resolveId(id) {
if (id === 'node:sqlite' || id === 'sqlite') {
return { id: 'node:sqlite', external: true };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: load('node:sqlite') is dead code because resolveId always marks the module as external.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At vitest.config.ts, line 19:

<comment>`load('node:sqlite')` is dead code because `resolveId` always marks the module as external.</comment>

<file context>
@@ -1,9 +1,47 @@
+    enforce: 'pre',
+    resolveId(id) {
+      if (id === 'node:sqlite' || id === 'sqlite') {
+        return { id: 'node:sqlite', external: true };
+      }
+      return null;
</file context>

@willwashburn willwashburn merged commit 4360cd7 into main Jun 5, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XXL This PR changes 1000+ lines, ignoring generated files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant